New Crystal Form of Human Neuropilin-1 b1 Fragment with Six Electrostatic Mutations Complexed with KDKPPR Peptide Ligand

Neuropilin 1 (NRP1), a cell-surface co-receptor of a number of growth factors and other signaling molecules, has long been the focus of attention due to its association with the development and the progression of several types of cancer. For example, the KDKPPR peptide has recently been combined with a photosensitizer and a contrast agent to bind NRP1 for the detection and treatment by photodynamic therapy of glioblastoma, an aggressive brain cancer. The main therapeutic target is a pocket of the fragment b1 of NRP1 (NRP1-b1), in which vascular endothelial growth factors (VEGFs) bind. In the crystal packing of native human NRP1-b1, the VEGF-binding site is obstructed by a crystallographic symmetry neighbor protein, which prevents the binding of ligands. Six charged amino acids located at the protein surface were mutated to allow the protein to form a new crystal packing. The structure of the mutated fragment b1 complexed with the KDKPPR peptide was determined by X-ray crystallography. The variant crystallized in a new crystal form with the VEGF-binding cleft exposed to the solvent and, as expected, filled by the C-terminal moiety of the peptide. The atomic interactions were analyzed using new approaches based on a multipolar electron density model. Among other things, these methods indicated the role played by Asp320 and Glu348 in the electrostatic steering of the ligand in its binding site. Molecular dynamics simulations were carried out to further analyze the peptide binding and motion of the wild-type and mutant proteins. The simulations revealed that specific loops interacting with the peptide exhibited mobility in both the unbound and bound forms.


Introduction
Neuropilins (NRP1 and NRP2) are type I single-pass transmembrane glycoproteins expressed in all vertebrates and have many physiological roles. They act as co-receptors of a range of growth factors and other signaling molecules. A recent cryo-electron microscopy structure highlights the role of NRP1 in a ternary complex with a semaphorin protein and a plexin receptor, which altogether mediates signaling in neuronal axon guidance and other processes [1]. NRP1 also forms a ternary complex with vascular endothelial growth factor A 165 (VEGF-A165) and the receptor VEGFR2 [2]. This complexation is associated with intracellular signaling, mitogenesis, cell migration and angiogenesis [3]. Research has shown that NRP1 plays a significant role in the development and progression of various cancer types [4] and also more recently in the infectivity of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [5]. In particular, this transmembrane receptor has been suggested as a molecular therapeutic target for glioblastoma, with overexpression mainly due to endothelial cells of angiogenic phenotype and associated pro-tumor macrophages, both of which are linked to an unfavorable prognosis [6]. A recent review outlines the various functions of NRP1 in the context of cancer treatments [7].
The membrane protein NRP1 (as NRP2) contains a large N-terminal extracellular region (~850 amino acids (AA)), a single transmembrane domain (~25 AA) and a short C-terminal cytoplasmic domain (~45 AA) [8]. The ectodomain consists of five independent domains, where the first four (a1, a2, b1 and b2 domains) are involved in ligand binding, while the role of the last one (c domain) is still under debate (oligomerization, NRP1 homodimerization, etc.) [1]. The interdomain linkers are also important in the heterocomplex formation as spacers [1]. The determination of the crystal structure of the b1 domain provided the first structural insight at the atomic-level into NRP1 [9]. The interaction of NRP1 with VEGF has been extensively studied. Briefly, the NRP1-b1 domain folds into a distorted jelly roll barrel motif that is composed of two beta-sheets [9] (Figure 1a). The strands are connected by loops of varying length. The bottom of the beta-barrel core exhibits a triangular shape that contains an intramolecular disulfide bridge. At the top of the beta-barrel core, the loops are divided in six loop regions (L1-L6) to delimit the pocket of the positively charged tail of VEGF. The C-terminal arginine residue of VEGF-A165 is buried in this pocket and is a key feature of the binding. Numerous atomic experimental structures have been determined to characterize and/or inhibit the interaction of NRP1-b1 and VEGF using peptides or arginine derivatives as well as a fusion protein [10][11][12][13].
We have developed peptides combined with a photosensitizer to target NRP1 in the context of photodynamic therapy (PDT) to detect and treat glioblastoma [14][15][16]. Recently, a nanoparticle was designed that combines KDKPPR motif as a targeting peptide, porphyrin as photosensitizer and gadolinium chelate as contrast agent. This nanoparticle, called AGuIX@PS@KDKPPR, enables the detection of tumor tissue by magnetic resonance imaging and treatment by PDT [6,17,18]. The affinity of the nanoparticle for human NRP1 was validated, and it was found to be ten times lower than that of the free peptide (K D of 4.7 µM for AGuIX@PS@KDKPPR and K D of 0.5 µM for KDKPPR) [17].
In this study, the KDKPPR peptide was synthesized, and its molecular interactions with NRP1-b1 fragment were investigated by X-ray crystallography and molecular dynamics (MD) simulations. Previously, we have attempted to co-crystallize NRP1-b1 with a carbohydrate-based peptidomimetic [19]. However, we constantly obtained tetragonal crystals that were isomorphous to the crystals of the unbound protein. These crystals are unsuitable for obtaining structures of complexes because the site that binds the Cterminal tail of VEGF is obstructed by symmetry-related protein molecules [9]. We have used site-directed mutagenesis to modify the repartition of charges on the surface of NRP1-b1 to induce changes in the crystal packing. This approach is an efficient tool for crystallizing a protein in a new form and facilitating, for example, the formation of a protein-ligand complex in the crystal through co-crystallization or soaking techniques [20]. In this study, we successfully co-crystallized the KDKPPR peptide with a hexavariant of NRP1-b1 (Glu277Lys, Glu285Lys, Asp289Lys, Glu367Lys, Lys373Glu, Lys397Glu). An original crystal form was obtained where the VEGF-binding pocket is filled by the KDKPPR peptide and is located in large spaces connected by solvent channels in the crystal packing. The structure of the NRP1-b1/KDKPPR complex was analyzed by MD simulations and innovative tools based on a multipolar electron density model [21]. the pocket of the positively charged tail of VEGF. The C-terminal arginine residue VEGF-A165 is buried in this pocket and is a key feature of the binding. Numerous atom experimental structures have been determined to characterize and/or inhibit t interaction of NRP1-b1 and VEGF using peptides or arginine derivatives as well a fusion protein [10][11][12][13].

Design of the NRP1-b1 Hexavariant
The charge distribution on the surface of NRP1-b1 was significantly altered in order to promote the formation of a new crystal form, based on a visual inspection of the model. The point mutations were chosen to be far away from the VEGF-binding pocket to minimize disruption of the peptide-binding site ( Figure 1b). Specifically, six charged residues on the protein surface were mutated to residues of opposite charge: Glu277Lys, Glu285Lys, Asp289Lys, Glu367Lys, Lys373Glu and Lys397Glu. Mutation of these residues did not lead to isoelectric conservation. In fact, the estimated isoelectric point of the hexavariant was 1 unit higher than that of the native protein, with a value of 9.2 for the hexavariant and 8.0 for the wild type. These choices resulted in the creation of a highly electropositive region around the Glu285Lys, Asp289Lys and Glu277Lys mutations, consisting of seven positively charged residues (Lys274, Lys277, Lys285, Lys289, Arg334, Arg402 and Lys425). Three of them (Lys277, Lys285 and Lys425) form hydrogen bonds with symmetry-related molecules in the crystal of the hexavariant (see below, Figure 2 and Table S1). The hexavariant (Glu277Lys, Glu285Lys, Asp289Lys, Glu367Lys, Lys373Glu, Lys397Glu) of the NRP1-b1 domain crystallized in the P3221 space group with a novel packing arrangement. The asymmetric unit contained two chains (A and C), each with a KDKPPR peptide (chains B and D) in its VEGF-binding site, plus 377 water molecules and an acetate ion. Only the last three residues of the peptide (PPR) in both monomers were included in the refined structure ( Figure 1a). Positive residual peaks in the difference electron density maps persisted around the first proline residue of the KDKPPR peptide. These peaks were slightly stronger in monomer D. We made several a empts to improve the final 2mFo-DFc electron density map, such as modeling an additional residue in the peptide or modeling alternative conformations for the first proline residue. However, no satisfactory model has emerged from any of these efforts. The two monomers were nearly identical, with an overall coordinate root mean square deviation (RMSD, Å) of 0.30 Å on the 154 Cα atoms common to both chains. Slight conformational differences were observed at the very first and last residues of the two monomers, due to their different involvement in packing contacts. This probably also explained the differences observed at the neighboring disulfide bridge Cys275-Cys424 of the two monomers. Indeed, in monomer A, it showed two alternative conformations, one of which was rather ill-defined, whereas only one conformation was observed in monomer C. The same argument concerning packing effects probably applied for the slight differences observed in some side chain conformations (Phe335) and sometimes in some main chain regions (Glu374-Pro378, Pro396-Pro398). The mutations did not affect the protein fold, since the current   The hexavariant (Glu277Lys, Glu285Lys, Asp289Lys, Glu367Lys, Lys373Glu, Lys397Glu) of the NRP1-b1 domain crystallized in the P3 2 21 space group with a novel packing arrangement. The asymmetric unit contained two chains (A and C), each with a KDKPPR peptide (chains B and D) in its VEGF-binding site, plus 377 water molecules and an acetate ion. Only the last three residues of the peptide (PPR) in both monomers were included in the refined structure ( Figure 1a). Positive residual peaks in the difference electron density maps persisted around the first proline residue of the KDKPPR peptide. These peaks were slightly stronger in monomer D. We made several attempts to improve the final 2mFo-DFc electron density map, such as modeling an additional residue in the peptide or modeling alternative conformations for the first proline residue. However, no satisfactory model has emerged from any of these efforts. The two monomers were nearly identical, with an overall coordinate root mean square deviation (RMSD, Å) of 0.30 Å on the 154 Cα atoms common to both chains. Slight conformational differences were observed at the very first and last residues of the two monomers, due to their different involvement in packing contacts. This probably also explained the differences observed at the neighboring disulfide bridge Cys275-Cys424 of the two monomers. Indeed, in monomer A, it showed two alternative conformations, one of which was rather ill-defined, whereas only one conformation was observed in monomer C. The same argument concerning packing effects probably applied for the slight differences observed in some side chain conformations (Phe335) and sometimes in some main chain regions (Glu374-Pro378, Pro396-Pro398). The mutations did not affect the protein fold, since the current hexavariant model showed an overall Molecules 2023, 28, 5603 5 of 17 RMSD of 0.51 Å with the wild-type structure (PDB entry 1KEX, [9]), compared to a 0.30 Å RMSD between the hexavariant independent monomers (A and C). Four of the mutations (Glu277Lys, Glu285Lys, Asp289Lys and Lys397Glu) resulted in no apparent change in the main chain fold. On the contrary, Glu367Lys and Lys373Glu were located in regions with larger observed displacements: the Cα atom of residue 367 underwent a 1.75 Å shift due to an overall movement of region Ser363-Trp369, while the ψ angle of Gly375 rotated 180 • in the rearrangement of the Glu373-Pro378 loop. Both of these observations are most likely the consequence of the change in crystal packing and protein-protein contacts (Figure 2), rather than the direct influence of the mutations on the polypeptide conformation.

Crystal Packing and Intermolecular Contacts
The mutations induced a new crystal packing in which a few introduced residues (Lys277 and Lys285 in chain A) were involved in modified intermolecular interactions. The two independent hexavariant models (chains A and C) have similar molecular environments. Indeed, more than half of the contacts are identical in both chains (Table S1). The KDKPPR peptide is not involved in the crystal cohesion. The hexavariant crystal contains large solvent spaces (volume of approximatively 30,000 Å 3 , Figure S1) into which it seems possible for small molecules to diffuse because they are connected by solvent channels. The N-terminal parts of the two independent KDKPPR peptides (KDK moiety, chains B and D) are located in the bulk solvent, while their C-terminal parts are tightly bound in the pocket that would welcome the C-terminal tail of VEGF, in chains A and C, respectively (see below). The electron density for the N-terminal regions of chains B and D was too weak to build a model, probably because these regions exhibited dynamic disorder.
We compared the hexavariant crystal form with those of the NRP1-b1 domain available in the Protein Data Bank [22]. Five structures of NRP1-b1 with distinct unit cell parameters were found ( Figure 2, Table S2). The asymmetric units contain one to four independent chains. The crystal structure of the tetragonal form I represents the unbound state of NRP1-b1, in which the VEGF-binding site is obstructed by symmetry-related molecules. All other crystal forms were obtained by co-crystallization of NRP1-b1 with arginine or close derivatives (crystal forms II to V). The intermolecular environment analysis revealed that crystal forms II and III have similar packings (P2 1 space group with two independent chains and P4 1 space group with one independent chain, respectively) ( Figure S2). We also noticed that in all cases, at least one independent ligand was positioned at the interface between two adjacent NRP1-b1 monomers ( Figure S3). Their presence was probably necessary for the cohesion of these crystal packings. That is why we worked with a variant, trying to find a new crystal packing where the ligand only contacts the protein to which it is bound.

Description of Ligand Binding
The KDKPPR peptide was bound in the VEGF-binding pocket formed by loops L1 to L5 of NRP1-b1, with its C-terminal arginine positioned in a very similar manner to what was described in detail by [11]. Briefly, its guanidine group forms a salt bridge involving two hydrogen bonds with Asp320 (L5), while the aliphatic part of its side chain is stacked between the phenyl rings of Tyr297 (L1) and Tyr353 (L3) (Figure 3a). The arginine residue of the ligand is also tightly bound at the main chain by residues of L3, with one of its carboxylate oxygen atoms hydrogen-bonded to the side chain hydroxyl groups of Thr349 and Tyr353 and the second one to the lateral chain of Ser346. The last strong anchoring of the KDKPPR peptide arises from the main chain carbonyl group of the first proline, which forms a hydrogen bond with the hydroxyl group of Tyr297 (L1). The first three residues of the peptide apparently found no preferred binding to NRP1-b1 that could have fixed them in a given conformation and made them visible in the electron density.
Molecules 2023, 28, x FOR PEER REVIEW 6 of 18 of the peptide apparently found no preferred binding to NRP1-b1 that could have fixed them in a given conformation and made them visible in the electron density.
(a) (b) Figure 3. (a) Structure of the binding site of NRP1-b1 hexavariant in complex with the KDKPPR peptide. The pocket is mainly composed of five loops (L1 to L5), which are respectively colored blue, cyan, green, red and turquoise. The crystallographic model of the peptide includes only the PPR moiety, represented as sticks. The NRP1 residues in the close proximity of the peptide are also depicted as sticks. The four structural water molecules are highlighted as spheres, while hydrogen bonds are illustrated as dashed sticks. Various labels are provided to enhance clarity, indicating loops, peptide, residues and water molecules. (b) Nucleophilic Influence Zones associated with the oxygen atoms Asp320-Oδ1 (dark blue), Asp320-Oε2 (light blue), Tyr297-Oη (green) and Glu348-Oε2 (major conformer, orange) in the vicinity of the ligand-binding site of monomer A. The corresponding atomic nucleophilic sites are indicated by colored circles, and the PPR moiety of the KDKPPR peptide is highlighted in yellow.
Ordered water molecules were checked and found identical to those discussed by [11]. Four structural water molecules in the peptide-binding site (HOH A690, A666, A717 and B101 for chain A and HOH C570, C549, C596 and D101 for chain B) were included in calculation of the contacts enrichment ratio (see below, Figure 3a).

Electrostatic Influence of the Protein on the Peptide
The VEGF-binding site of NRP1-b1 is occupied by the PPR moiety of the KDKPPR peptide, with its electrophilic C-terminal arginine residue forming a salt bridge with Asp320. For this reason, it is interesting to study the electrostatic influences from the whole NRP1 protein on the peptide from the point of view of Nucleophilic Influence Zones (NIZ) [23] (Figure 3b). A NIZ represents the volume containing all the electric field lines converging to a specific nucleophilic site, often an oxygen atom in proteins. Therefore, an electrophilic ligand within this space, like a positive charged entity, experiences a ractive electrostatic forces directed toward the corresponding nucleophilic site. NIZs associated with the relevant oxygen atoms involved in the binding of the PPR moiety were calculated excluding ligand atoms and using CHARGER program [24].
As anticipated, the NH2 groups of the arginine guanidinium of the KDKPPR peptide are influenced by their respective hydrogen-bonded oxygen atoms of Asp320 (dark blue surface for Oδ1 atom and light blue surface for Oδ2 atom) (Figure 3b). The NIZ of the The pocket is mainly composed of five loops (L1 to L5), which are respectively colored blue, cyan, green, red and turquoise. The crystallographic model of the peptide includes only the PPR moiety, represented as sticks. The NRP1 residues in the close proximity of the peptide are also depicted as sticks. The four structural water molecules are highlighted as spheres, while hydrogen bonds are illustrated as dashed sticks. Various labels are provided to enhance clarity, indicating loops, peptide, residues and water molecules. (b) Nucleophilic Influence Zones associated with the oxygen atoms Asp320-Oδ1 (dark blue), Asp320-Oε2 (light blue), Tyr297-Oη (green) and Glu348-Oε2 (major conformer, orange) in the vicinity of the ligand-binding site of monomer A. The corresponding atomic nucleophilic sites are indicated by colored circles, and the PPR moiety of the KDKPPR peptide is highlighted in yellow.
Ordered water molecules were checked and found identical to those discussed by [11]. Four structural water molecules in the peptide-binding site (HOH A690, A666, A717 and B101 for chain A and HOH C570, C549, C596 and D101 for chain B) were included in calculation of the contacts enrichment ratio (see below, Figure 3a).

Electrostatic Influence of the Protein on the Peptide
The VEGF-binding site of NRP1-b1 is occupied by the PPR moiety of the KDKPPR peptide, with its electrophilic C-terminal arginine residue forming a salt bridge with Asp320. For this reason, it is interesting to study the electrostatic influences from the whole NRP1 protein on the peptide from the point of view of Nucleophilic Influence Zones (NIZ) [23] (Figure 3b). A NIZ represents the volume containing all the electric field lines converging to a specific nucleophilic site, often an oxygen atom in proteins. Therefore, an electrophilic ligand within this space, like a positive charged entity, experiences attractive electrostatic forces directed toward the corresponding nucleophilic site. NIZs associated with the relevant oxygen atoms involved in the binding of the PPR moiety were calculated excluding ligand atoms and using CHARGER program [24].
As anticipated, the NH 2 groups of the arginine guanidinium of the KDKPPR peptide are influenced by their respective hydrogen-bonded oxygen atoms of Asp320 (dark blue surface for Oδ1 atom and light blue surface for Oδ2 atom) (Figure 3b). The NIZ of the Glu348-Oε2 atom (orange surface) covers the position of one peptide proline residue, while the NIZ of the Tyr297-Oη atom (green surface) encompasses the majority of the other peptide proline residue. Hence, these NIZs illustrate the forces acting on both proline residues, specifically exerting an attraction on their hydrogen atoms, thus stabilizing the ligand conformation by pulling one proline residue towards Glu348 and the other towards Tyr297.
Given the close proximity of the solvent dielectric medium and the distances between the proline residues involved and the generators of the discussed NIZs (i.e., Glu348-Oε2 and Tyr297-Oη), it is likely that the electrostatic stabilization is relatively weak but cannot be disregarded. Electrostatic forces, being long-range interactions, play a role, and the presence of numerous charged hydrogen atoms in the pyrrolidine rings of prolines facilitates favorable interactions with the negatively charged oxygen atoms of the side chains of Glu348 and Tyr297, supporting our interpretation.
However, there are other factors contributing to the stabilization of the KDKPPR peptide that deserve attention. One such factor is the strong hydrogen bond between the hydroxyl group of Tyr297 and the carbonyl oxygen atom of the first proline residue in KDKPPR (Figure 3a). Furthermore, a stabilizing van der Waals contact can be inferred from the proximity of the Glu348-Cγ hydrogen atoms and the Cγ hydrogen atoms in the second proline residue of the KDKPPR peptide.
Furthermore, since electrostatic influences participate in the driving of ligand diffusion across the solvent toward a protein-binding site [25], the NIZs also provide insights on the electrostatic forces originating from the protein residues and directing the ligand during its approach. Here, the NIZs of Glu348-Oε2 and Asp320-Oδ2 atoms extend beyond the protein surface, suggesting their potential contribution to the electrostatic steering effect that facilitates electropositive ligand fixation in the VEGF-binding site of NRP1-b1 (Figure 3b).

Hirshfeld Surface and Contacts Enrichment Ratio
The Hirshfeld surface [26] between the peptide and the protein was calculated with MoProViewer software [27]. The Hirshfeld surface allows the analysis and visualization of intermolecular interactions. The contact enrichment ratio E XY between chemical species X and Y is obtained by comparing the actual C XY contacts with those calculated as if all contact types had the same probability of forming [28]. The equiprobable proportions R XY are derived by probability products from the chemical proportions on the Hirshfeld surface. An E XY enrichment ratio greater than unity for a particular contact between chemical species X. . .Y indicates that these are over-represented. The chemical nature of the contacts and their enrichment in the complex of NRP1-b1 with KDKPPR peptide are shown in Table 1. The proportions of contact types are very similar in the two independent monomers of NRP1-b1 (correlation of C XY contact type proportions of 99.9%).
The less polar Hc hydrogen atoms bonded to carbon were distinguished from the more electropositive Ho/n atoms bound to oxygen or nitrogen. Four structural water molecules in monomer A (and their equivalent in monomer B, see above) in the binding cleft were kept and attributed to the protein in the complex. Obviously, the O. . .Ho/n hydrogen bonds are strongly attractive from an electrostatic point of view and are overrepresented (E = 2.58, Table 1). Representing 17.4% of the interaction surface, they are recognized as the most favored contacts. This concerns interactions between C=O and COO − acceptors and N-H, NH 2 and O-H hydrogen bond donors. The O. . .Hc weak hydrogen bond contacts represent 15.3% of the interaction surface and can be considered as weakly favored contacts, with an enrichment ratio of E = 1.20. Some enrichment ratios close to zero concern the O. . .O and Hn/o. . .Hn/o contacts, which are absolutely avoided in the protein/ligand complex because they concern repulsive self-contacts between charged species.
Occupying the largest contact area (21.3%), non-polar Hc. . .C contacts are significantly over-represented (E = 1.80) and consist in particular of C-H. . .π interactions involving the aromatic rings of Tyr239, Tyr183 and Tyr187. The Hc. . .Hc contacts represent 10.3% of the surface and can be considered as weakly disfavored contacts, as they present an enrichment ratio lower than unity (E = 0.81). The Ho/n. . .W (oxygen atoms of water molecules) contacts involving the four structural water molecules represent 6.9% of the interaction surface and can be considered as significantly favored contacts, with E = 1.88. The four water molecules interact essentially with Ho/n and secondarily with Hc atoms. * The Hirshfeld surface was limited to the regions where the electron density is larger than 0.0013 e/Å 3 in order to omit the peptide surface exposed to the solvent. # Oxygen atom of water molecules. The second and third rows show the chemical content on the Hirshfeld surface. The next rows show the % C XY of the contact types on the surface, followed by their enrichment ratios. The major surface components, the C XY contacts and the significantly enriched contacts (E > 1) are highlighted in bold characters. In the lower part of the table, the atoms are grouped into hydrophobic (Hphob) and hydrophilic (Hphil) atoms.
The hydrophobic and hydrophilic atoms were regrouped in order to analyze the interactions between the two subgroups. The N contact surface occurs on sp 2 peptide and guanidinium nitrogen atoms (N without electron lone pair) and was considered hydrophobic together with the C and Hc atoms. The peptide and protein interaction surfaces are constituted by more hydrophobic (57.1 and 52.6%, respectively) than hydrophilic atoms. The protein/ligand complex shows an enrichment of contacts between hydrophilic atoms (E = 1.26) and between hydrophobic atoms (E = 1.38). On the other hand, despite the mild enrichment of the weak hydrogen bonds of O. . .Hc, the cross contacts Hphob x Hphil are strongly under-represented (E = 0.69), which indicates a good partitioning of hydrophilic and hydrophobic contacts.
In summary, the protein/ligand complex is mainly maintained by over-represented strong O. . .Ho/n interactions, which correspond notably to the salt bridge anchoring Asp206 and the arginine residue of the peptide. Secondarily, more moderately enriched interactions also play an important role, such as weak C-H. . .O hydrogen bonds and hydrophobic contacts, notably between Hc and C atoms. The enrichment values agree with trends found in studies of interactions in several families of oxygenated and nitrogenated hydrocarbon molecules [28,29]. The strong hydrogen bonds such as O. . .Ho/n are significantly enriched; in the case of small-molecule crystals, the over-representation reaches even larger values beyond 10.
Concerning the weak C-H. . .O hydrogen bonds, they tend to occur in a moderately under-represented way in crystal structures of small molecules containing both strong H-bond donors and acceptors (such as alcohols for example), due to the competition of strong H-bonds. On the contrary, in the present protein/peptide interface, they appear slightly enriched. This can be explained by the excess of strong H-bond acceptors on the protein (S O + S W = 31.9%) compared to S Ho/n = 22.9% of strong H-bond donors on the peptide.

Molecular Dynamics
Coordinate RMSD of NRP1-b1 was monitored along the molecular dynamics (MD) trajectories and stabilized after 50 ns in a range of 1-2 Å. Similar coordinate drift was observed for the bound and unbound proteins, as well as for the mutated and wild-type sequences. Only one copy of the mutated protein in complex with the peptide displays slightly larger coordinate drift (Figure 4).    The root mean square thermal displacements (RMSTDs) of the Cα atoms in the crystal structure were derived from the thermal parameters using the formula RMSTD = 8 ⁄ . This estimation is meaningful because we obtained a high-resolution structure (1.35 Å, Table 2) [30]. The values were then compared to the average RMSF obtained from three MD simulations of NRP1-b1 hexavariant in complex with the peptide ( Figure S4). RMSTD and RMSF show similar profiles, indicating a general agreement between both indicators. However, the values of RMSF are noticeably larger (up to 1.7 Å) in the most dynamic regions of the polypeptide chain, while RMSTDs consistently remain below 0.75 Å. One plausible explanation for this discrepancy is that proteins in the crystalline state typically exhibit reduced mobility compared to their counterparts in solution. Overall, RMSF and RMSTD show a correlation coefficient of 0.735, indicating a moderate correlation between the two parameters. The values of RMSF are noticeably larger (up to 1.7 Å) in the most dynamic region of the polypeptide chain, while RMSTDs consistently remain below 0.75 Å. Several MD simulations of the literature [31][32][33][34] have also shown larger fluctuations in solution compared to the crystal environment. Our observations agree with these references. The root mean square thermal displacements (RMSTDs) of the Cα atoms in the crystal structure were derived from the thermal parameters using the formula RMSTD = √ B/8π 2 . This estimation is meaningful because we obtained a high-resolution structure (1.35 Å, Table 2) [30]. The values were then compared to the average RMSF obtained from three MD simulations of NRP1-b1 hexavariant in complex with the peptide ( Figure S4). RMSTD and RMSF show similar profiles, indicating a general agreement between both indicators. However, the values of RMSF are noticeably larger (up to 1.7 Å) in the most dynamic regions of the polypeptide chain, while RMSTDs consistently remain below 0.75 Å. One plausible explanation for this discrepancy is that proteins in the crystalline state typically exhibit reduced mobility compared to their counterparts in solution. Overall, RMSF and RMSTD show a correlation coefficient of 0.735, indicating a moderate correlation between the two parameters. The values of RMSF are noticeably larger (up to 1.7 Å) in the most dynamic region of the polypeptide chain, while RMSTDs consistently remain below 0.75 Å. Several MD simulations of the literature [31][32][33][34] have also shown larger fluctuations in solution compared to the crystal environment. Our observations agree with these references.
Looking more closely at the RMSF profiles ( Figure 5), the loops interacting with the peptide all display fluctuation peaks whether the peptide is absent or present. The loop 347-350 (L3 loop) bearing Glu348 displays the same mobility. By contrast, a group of loops clustered on the other side of the binding side, the loop 296-300 (L1) bearing Tyr297 and Asn300, the loop 310-313 (L2) bearing Glu312 and the loop 318-322 (L5) bearing Glu319, are more mobile in the absence than in the presence of the ligand. Somehow, the two sides of the binding pocket behave differently with respect to the ligand. One moiety (L1, L2 and L5) stabilizes upon binding, while the other moiety (L3) retains the same mobility. A recent MD study by Alshawaf et al. [35] found similar RMSF profiles in NRP1-b1 when complexed with specialized metabolites 3-O-methylquercetin and esculetin. Specifically, the fluctuations in the esculetin/NRP1-b1 complex were similar to those observed in our study of the unbound form, while the 3-O-methylquercetin/NRP1-b1 complex was more similar to our complex with the peptide. Notably, the study found that 3-O-methylquercetin had a more favorable energy of interaction with NRP1-b1 than esculetin [35]. Our "bound" RMSF profile may be indicative to a state of NRP1-b1 that allows stable interactions with a ligand.
with I hi being the intensity of an individual observation of the reflection h and I h being the average of all symmetry-related or replicate observations); 4 CC 1/2 is the correlation coefficient of the mean intensities between two random half-sets of data. 5 of the reflections, R free same formula (5% of the reflections) (F o and F c observed and calculated structure factors, respectively). 6 RMSZ: root mean square Z-score. 7 The MolProbity clash-score is the number of serious clashes per 1000 atoms.
Comparing the fluctuation profiles of the WT and of the mutated sequences of NRP1-b1 ( Figure 5), only one major difference can be noticed: the loop 411-416 (L4), interacting with the peptide, displays a fluctuation peak for one of the unbound trajectories on the WT sequence, whereas this loop is quite rigid in all trajectories recorded for the modified sequence.
The loops 282-289 and 373-378, located in the bottom of the structure, display high mobility in all conditions. The loop 282-289, containing the charged and polar sequence ESGEIHSD, becomes more mobile in the absence of peptide. The sequence ESGEIHSD (residues 282-289) is located at the surface of b1 domain close to the surface of a2 domain in NRP1 structures, which contain a2, b1 and b2 domains (PDB entry 2QQM, [36]) or a1, a2, b1 and b2 domains (PDB entry 4GZ9, [37]). A similar configuration is also visible in the more recent cryo-EM structure of the Sema3A/PlexinA4/NRP1 tripartite complex [1]. As the b1 and a2 interface do not form direct contact in any of these structures, it is difficult to speculate on the precise functional effect of the mobility of the region, but this mobility may have an influence on the propagation of conformational signals during the physiological processes.

KDKPPR Synthesis on Solid Phase
The KDKPPR peptide was synthesized using the automated ResPepXL peptide synthesizer, with a Fmoc/tBu methodology. The side chains of arginine, lysine and aspartic acid were protected by Pbf, OtBu and Boc groups. A Fmoc-Arg(Pbf)-Wang resin swelled in DCM was used. The Fmoc group was removed by a piperidine solution (20% in DMF), and this step was performed two times (the first for 4 min and the second for 7 min). Then, the next AA was grafted by adding an excess of Fmoc-AA-OH (6 eq), HBTU (5 eq), NMP (3 eq) and NMM (10 eq) in DMF, and this step was repeated two times for 18 min. A last step of capping, using a solution of acetic anhydride (5% in DMF), was performed for 5 min to trap all amino functions that did not react. Deprotection, coupling and capping steps were repeated until the end of the synthesis of the peptide. After a last Fmoc deprotection, the resin was dried under vacuum and then cleaved (with full deprotection of lateral chains) using TFA/TIPS/water (92.5/2.5/5, v/v/v) for 2 h. The acidic resin was filtered and washed with DCM and EtOH. The filtrate was dried under vacuum, and the compound was precipitated in diethylether by centrifugation. TSK gel Amide-80 column was used for HILIC purification of KDKPPR peptide using acetonitrile/water (0.1% TFA; 95/5 (v/v) to 55/45 (v/v) gradient) for 15 min, followed by an isocratic elution (0.1% TFA; 55/45, v/v) for 10 min at a flow of 12 mL/min (R t = 20.8 min). KDKPPR was isolated as a white powder with a yield of 64% and a purity of 95% (UV-vis detection at 214 nm).

Crystallization
Crystallization was conducted by sitting-drop vapor diffusion method. The reservoir solutions were prepared by mixing 0.3 µL of commercial reservoir solutions (screens) and 0.3 µL of protein solution. The crystallization trials of NRP1-b1 hexavariant were carried out in the presence of the KDKPPR hexapeptide at 20 • C. Crystals appeared with a JCSGplus solution composed of 0.2 M ammonium citrate, 0.1 M bis-tris pH 5.5 and 25% w/v PEG 3350.

X-ray Diffraction Data Collection and Crystal Structure Determination
Crystals of NRP1-b1 hexavariant that appeared suitable for X-ray diffraction data collection were quickly soaked in their mother liquor supplemented with 20% glycerol (v/v), before flash freezing in a nitrogen stream at 100 K. Preliminary X-ray diffraction experiments were carried out in-house on an Agilent SuperNova diffractometer (Oxford Diffraction, Oxford, UK) equipped with a CCD detector, and high-resolution data were further collected by ESRF synchrotron on beamline BM07 (Grenoble, France). The data set was indexed and integrated with XDS [38], scaled, and merged with Aimless [39] from the CCP4 suite [40]. The atomic structure was solved by molecular replacement using MOLREP [41] with the coordinates of NRP1-b1 wild type (PDB code 5C7G, [19]) as the search model. The structure was manually adjusted with Coot [42] and refined with Buster [43]. Structure validation was performed with MolProbity [44] and the wwPDB validation server (http: //validate.wwpdb.org, (accessed on 22 december 2022)). Diffraction data and refinement statistics are shown in Table 2. Figures of the protein structures were generated with Pymol (Schrödinger LLC, NewYork, NY, USA), MoProViewer [27] and Ligplot+ [45], and cleft volume calculations were performed with 3V [46]. Coordinates and structure factors were deposited in the Protein Data Bank (PDB ID: 8PFE, DOI:10.2210/pdb8pfe/pdb).

Nucleophilic Influence Zones
The Nucleophilic Influence Zones were calculated from the electrostatic potential using Charger module of MoProviewer [24,27]. The electrostatic potential was generated from an electron density model, based on transferred multipolar parameters of the ELMAM2 library [21].

Hirshfeld Analysis
MoProViewer software [27] was used to investigate the intermolecular interactions and the contacts enrichment on the Hirshfeld interface between the protein molecules and the ligand peptides. The intermolecular interactions were evaluated by computing the enrichment ratios (Table 1) in order to highlight which contacts are favored. The enrichment values are obtained as the ratio between the proportions of actual contacts C XY and the equiprobable (random) contacts R xy , the latter being obtained by probability products (R XY = S X S Y ). Contacts X. . .Y, which are over-represented with respect to the share of X and Y chemical species on the Hirshfeld surface, have enrichments larger than unity. They are likely to represent interactions that are attractive from an electrostatic point of view and shall be the driving force in the complex formation [28]. Interactions between atoms that have electric charges of the same sign are repulsive and are generally under-represented (E < 1).

Molecular Dynamics
The protein and peptide chains C and D were selected from the X-ray crystallographic structure. The hydrogen atoms were added, and the flip of side chains was optimized using the Molprobity server [44]. The NRP1-b1 crystal structure with six mutations was unmodified, whereas for the wild-type (WT) system, the mutations Lys277Glu, Lys285Glu, Lys289Asp, Lys367Glu, Glu373Lys and Glu397Lys were introduced to return to the WT sequence of NRP1 for both protein sequences. The bound and unbound systems were simulated.
For each previously described system, the protein was embedded in a water box. Sodium and chloride counterions were added to obtain an ionic concentration of 0.15 M. The total number of atoms was about 31,000 in both cases. All MD simulations were performed using NAMD 2.14 [47], with the CHARMM36 force field [48] for protein and the TIP3P model for water [49]. A cutoff of 12 Å and a switching distance of 10 Å were used for non-bonded interactions, while long-range electrostatic interactions were calculated with the Particle Mesh Ewald (PME) method [50]. The RATTLE algorithm [51] was used to keep rigid all covalent bonds involving hydrogen atoms, enabling a time step of 2 fs. At the beginning of each trajectory, the system was minimized for 20,000 steps, and it was then heated up gradually from 0 K to 310 K in 31,000 integration steps. Finally, the system was equilibrated for 1 ns in the NPT ensemble at 310 K. During the equilibration stage, the Cα atoms were kept fixed. Simulations were then performed in the NPT ensemble (P = 1 bar, T = 310 K), with all atoms free to move. Atomic coordinates were saved every 10 ps. For each trajectory, 200 ns of production and the trajectories were triplicated for a cumulative trajectory duration of 3 µs.

Conclusions
In this study, we have reported the crystal structure of a hexavariant of the domain b1 of human NRP1 in complex with the KDKPPR peptide. The mutant was designed to modify the monomer assembly observed in the crystal packing of the unbound form [9,19], in which the VEGF-binding pocket of NRP1-b1 is inaccessible. Molecular dynamic trajectories permitted investigating the differences in the structures of the wild type and variant. Both structures produced similar internal flexibility and protein/peptide interaction. This showed that the ability of the protein to bind small ligands was not affected by the designed mutations. As part of our future search for ligands, we need to check that the dissociation constant (K D ) of the molecules tested is the same for mutated and wild-type NRP1-b1.
The NRP1-b1 hexavariant crystallized in a new crystal form, in which the KDKPPR peptide was not involved in the cohesion of the solid state. In the crystal, the peptidebinding site was observed to communicate with a solvent cavity large enough to diffuse small molecules. Therefore, ligand soaking in the crystal of the unbound form of NRP1-b1 hexavariant could be considered as a strategy to prepare NRP1-b1 complexes with peptides that target the pocket where VEGF binds.
The structure of the NRP1-b1 hexavariant in complex with the KDKPPR peptide was analyzed with two original tools. First, the Nucleophilic Influence Zones (NIZ) of the ligand-binding cleft were analyzed. They revealed two additional residues (Tyr297 and Glu348) as probable attractors of the ligand electrophilic groups and two residues (Asp320 and Glu348) in the electrostatic steering of the ligand in its binding site. Secondly, the enrichment of contacts was calculated to analyze the interactions between the protein and the peptide. This metric has provided valuable insights into the diversity and specificity of the protein/ligand interaction. The complex was mainly stabilized by a notable presence of strong N-H. . .O and O-H. . .O hydrogen bonds, which were crucial due to the looprich nature of the VEGF-binding site. Indeed, these loops exhibited mobility both in the unbound and bound forms, as suggested by the MD simulation.
Supplementary Materials: The following supporting information can be downloaded at https: //www.mdpi.com/article/10.3390/molecules28145603/s1, Figure S1. Large aqueous cavity in the crystal structure of NRP1-b1 hexavariant; Figure S2: Comparison of the packing of NRP1-b1 in crystal forms II (a) and III (b); Figure S3: Environment of the ligand in crystal forms II, III, IV and V; Figure  S4: Comparison of atomic root mean square fluctuations (RMSFs) and root mean square thermal displacements (RMSTDs); Table S1: Intermolecular hydrogen bonds between NRP1-b1 chains in hexavariant crystal; Table S2: Crystal forms of NRP1-b1 fragment in the unbound form or in complex with small ligands.